Introduction to Python Tokens
In Python programming, tokens are the smallest individual units that make up a program. When you write Python code, the interpreter breaks it down into these tokens during the lexical analysis phase before parsing and execution.
Tokens can be thought of as the "words" and "punctuation" of the Python language. Understanding these building blocks is essential for writing correct code and debugging issues.
Note: The process of breaking source code into tokens is called tokenization or lexical analysis.
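To make tokenization concrete, the sketch below runs the standard library's tokenize module over one line of code and prints each token with its type. The sample statement and the variable names (source, tokens) are illustrative assumptions, not part of any particular program.

```python
import io
import tokenize

# A one-line sample program (chosen only for illustration).
source = "total = price * 2\n"

# generate_tokens expects a readline callable that yields str lines.
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

for tok in tokens:
    # tok.type is an integer code; tok_name maps it to a readable name.
    print(f"{tokenize.tok_name[tok.type]:<10} {tok.string!r}")
```

The output lists NAME, OP, and NUMBER tokens, followed by the NEWLINE and ENDMARKER tokens the tokenizer adds to mark the end of the line and of the input.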
Categories of Python Tokens
Python tokens can be categorized into several types. The main categories are:
1. Keywords
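Keywords are reserved words that have special meaning in Python. They define the language's syntax and structure and cannot be used as identifiers (variable names, function names, etc.). There are 35 keywords in current Python 3 releases (the set has changed between versions), and the keyword module lists them for the interpreter you are running: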
import keyword
print(keyword.kwlist)
| Keyword | Description | Example |
|---|---|---|
| if, else, elif | Conditional statements | if x > 5: print("Large") |
| for, while | Loop constructs | for i in range(5): |
| def | Function definition | def my_function(): |
| class | Class definition | class MyClass: |
| import, from | Module importing | import math |
| return | Return from function | return result |
| True, False | Boolean literals | flag = True |
| None | Represents null value | value = None |
| and, or, not | Logical operators | if a and b: |
def calculate(a, b):
    if a > b:
        return a
    else:
        return b
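The rule that keywords cannot be used as identifiers is enforced when code is compiled, before it runs. The following is a minimal sketch using the built-in compile() function; the helper name compiles is hypothetical and exists only for this example.

```python
import keyword

# The count depends on the interpreter version, so it is printed rather than assumed.
print(len(keyword.kwlist))


def compiles(source):
    """Return True if source is syntactically valid Python (hypothetical helper)."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


print(compiles("klass = 5"))  # True: an ordinary identifier
print(compiles("class = 5"))  # False: 'class' is a reserved keyword
```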
2. Identifiers
Identifiers are names given to variables, functions, classes, modules, and other objects. They are user-defined and follow specific rules:
- Can contain letters (a-z, A-Z), digits (0-9), and underscores (_)
- Cannot start with a digit
- Cannot be a keyword
- Are case-sensitive (myVar and myvar are different)
| Valid Identifiers | Invalid Identifiers | Reason |
|---|---|---|
| my_variable | my-variable | Hyphens not allowed |
| user2 | 2user | Cannot start with digit |
| _private | class | class is a keyword |
| FirstName | first name | Spaces not allowed |
my_variable = 10 # variable identifier
def calculate_sum(a, b): # function identifier
    return a + b
class MyClass: # class identifier
    pass
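The identifier rules above can be checked programmatically. This sketch combines str.isidentifier() with keyword.iskeyword() and runs it over the names from the table; the function name is_valid_identifier is an assumption made for the example. Note that str.isidentifier() also accepts many non-ASCII Unicode letters, which Python 3 allows in names.

```python
import keyword


def is_valid_identifier(name):
    """Check legal characters, no leading digit, and not a reserved keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)


candidates = [
    "my_variable", "my-variable",
    "user2", "2user",
    "_private", "class",
    "FirstName", "first name",
]

for candidate in candidates:
    print(f"{candidate!r}: {is_valid_identifier(candidate)}")
```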
3. Literals
Literals are fixed values written directly in the source code, such as 25 or "Alice". The value a literal denotes is determined by how it is written, not by any computation at run time.
Types of Literals:
| Literal Type | Description | Examples |
|---|---|---|
| Numeric Literals | Integer, float, and imaginary (complex) numbers | 5, 3.14, 3j |
| String Literals | Sequence of characters enclosed in quotes | "hello", 'Python', """multiline string""" |
| Boolean Literals | True or False values | True, False |
| Special Literal | Represents absence of value | None |
| Collection Literals | List, tuple, set, dictionary literals | [1, 2, 3], (1, 2), {1, 2, 3}, {"a": 1} |
age = 25 # Integer literal
price = 19.99 # Float literal
name = "Alice" # String literal
is_valid = True # Boolean literal
scores = [90, 85, 95] # List literal
person = {"name": "Bob", "age": 30} # Dictionary literal
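One way to see which type each literal produces is to evaluate its source text with ast.literal_eval, which accepts only literal syntax and refuses arbitrary expressions. The sample strings below are assumptions chosen to mirror the examples in this section.

```python
import ast

# Source text for several literals, mirroring the examples in this section.
samples = [
    "25",
    "19.99",
    "3j",
    '"Alice"',
    "True",
    "None",
    "[90, 85, 95]",
    '{"name": "Bob", "age": 30}',
]

for text in samples:
    value = ast.literal_eval(text)  # safely evaluates literal syntax only
    print(f"{text:<28} -> {type(value).__name__}")
```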
4. Operators
Operators are tokens that perform operations on values, such as arithmetic, comparison, and logical tests.
| Operator Type | Symbols | Description |
|---|---|---|
| Arithmetic | +, -, *, /, %, **, // | Mathematical operations |
| Comparison | ==, !=, >, <, >=, <= | Compare values |
| Logical | and, or, not | Boolean operations |
| Assignment | =, +=, -=, *=, /=, etc. | Assign values to variables |
| Identity | is, is not | Check whether two names refer to the same object |
| Membership | in, not in | Check whether a value exists in a container |
| Bitwise | &, \|, ^, ~, <<, >> | Operations on bits |
a = 10 + 5 # Arithmetic operator
b = a * 2 # Arithmetic operator
if a > b and a > 0: # Comparison and logical operators
    print("a is positive and larger than b")
numbers = [1, 2, 3]
if 2 in numbers: # Membership operator
    print("2 is in the list")
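Some operators in the table look alike but behave differently. The short sketch below contrasts equality (==) with identity (is) and true division (/) with floor division (//), and shows a few bitwise results. The variable names are illustrative only.

```python
a = [1, 2, 3]
b = [1, 2, 3]
c = a

print(a == b)  # True: the two lists have equal contents
print(a is b)  # False: they are two separate list objects
print(a is c)  # True: both names refer to the same object

print(7 / 2)    # 3.5: true division always returns a float
print(7 // 2)   # 3: floor division
print(-7 // 2)  # -4: the result is floored toward negative infinity
print(7 % 2)    # 1: remainder
print(2 ** 10)  # 1024: exponentiation

print(6 & 3, 6 | 3, 6 ^ 3)  # 2 7 5: bitwise AND, OR, XOR
```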
5. Punctuators
Punctuators are symbols used to organize code structure, define blocks, separate elements, and indicate relationships between different parts of code.
| Punctuator | Description | Example Usage |
|---|---|---|
| ( ) | Parentheses - function calls, expression grouping | print("hello"), (a + b) * c |
| [ ] | Square brackets - list indexing, list creation | my_list[0], [1, 2, 3] |
| { } | Curly braces - dictionaries, sets | {"a": 1}, {1, 2, 3} |
| : | Colon - start of code block | if x > 5:, def function(): |
| , | Comma - separate items | [1, 2, 3], print(a, b) |
| . | Dot - attribute access | math.sqrt(), obj.method() |
| ; | Semicolon - separate statements on same line | a = 5; b = 10 |
| @ | At symbol - decorators | @decorator |
def add_numbers(a, b): # Parentheses and colon
    result = a + b # Assignment operator
    return result # Return statement
my_list = [1, 2, 3] # Square brackets and commas
value = my_list[0] # Square brackets for indexing
person = {"name": "Alice", "age": 30} # Curly braces and commas
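The standard tokenize module reports operators and punctuators alike under one generic OP type, and each token's exact_type attribute identifies the specific symbol. The sketch below lists the punctuators in a sample line; the line itself is an assumption and is only tokenized, never executed, so the names in it need not exist.

```python
import io
import token
import tokenize

# Sample line containing several punctuators (tokenized only, never run).
source = "value = my_list[0]; print(value, obj.attr)\n"

punctuation = []
for tok in tokenize.generate_tokens(io.StringIO(source).readline):
    if tok.type == token.OP:
        # exact_type distinguishes '(' from '[' and so on.
        punctuation.append((tok.string, token.tok_name[tok.exact_type]))

for symbol, name in punctuation:
    print(f"{symbol!r:<6} {name}")
```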
Conclusion
Understanding Python tokens is fundamental to mastering the language. These building blocks—keywords, identifiers, literals, operators, and punctuators—form the basis of all Python code.
Recognizing these tokens makes it easier to read error messages and to spot mistakes such as a misplaced symbol or a reserved word used as a name. As you continue your Python journey, you'll develop an intuitive understanding of how these tokens work together to form complete programs.
Tip: When encountering syntax errors, carefully examine your tokens—often the issue is a missing punctuator, misspelled keyword, or invalid identifier.
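As a rough illustration of the tip above, the built-in compile() function raises SyntaxError with a line number and message that point at the offending token. The helper name describe_syntax_error and the sample snippet are hypothetical, and the exact message text varies between Python versions.

```python
def describe_syntax_error(source):
    """Return (line number, message) for the first syntax error, or None."""
    try:
        compile(source, "<snippet>", "exec")
    except SyntaxError as err:
        return err.lineno, err.msg
    return None


# The colon that should end the def line is missing.
broken = "def greet(name)\n    return name\n"
fixed = "def greet(name):\n    return name\n"

print(describe_syntax_error(broken))  # line 1, with a version-dependent message
print(describe_syntax_error(fixed))   # None
```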